Use of Bad Training Data for Better Predictions

Authors

  • Tal Grossman
  • Alan S. Lapedes
Abstract

We show how randomly scrambling the output classes of various fractions of the training data may be used to improve predictive accuracy of a classification algorithm. We present a method for calculating the "noise sensitivity signature" of a learning algorithm which is based on scrambling the output classes. This signature can be used to indicate a good match between the complexity of the classifier and the complexity of the data. Use of noise sensitivity signatures is distinctly different from other schemes to avoid overtraining, such as cross-validation, which uses only part of the training data, or various penalty functions, which are not data-adaptive. Noise sensitivity signature methods use all of the training data and are manifestly data-adaptive and non-parametric. They are well suited for situations with limited training data.
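The scrambling procedure described in the abstract can be illustrated with a minimal sketch. The abstract does not define the signature precisely, so everything here is an assumption for illustration: the helper names (`scramble_labels`, `knn_predict`, `noise_sensitivity_curve`) are hypothetical, "scrambling" is taken to mean redrawing a chosen fraction of labels uniformly at random, the classifier is a plain k-nearest-neighbor rule whose complexity is controlled by `k`, and the curve reports agreement with the original labels after training on scrambled ones. It is not the authors' exact method.

```python
import numpy as np

def scramble_labels(y, fraction, n_classes, rng):
    """Return a copy of y with `fraction` of the labels redrawn uniformly at random.

    Assumption: this is one plausible reading of "scrambling the output classes".
    A redrawn label may coincide with the original one.
    """
    y_noisy = y.copy()
    n_scramble = int(round(fraction * len(y)))
    idx = rng.choice(len(y), size=n_scramble, replace=False)
    y_noisy[idx] = rng.integers(0, n_classes, size=n_scramble)
    return y_noisy

def knn_predict(X_train, y_train, X_query, k, n_classes):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    d = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    votes = y_train[nearest]  # shape (n_query, k)
    counts = np.stack([(votes == c).sum(axis=1) for c in range(n_classes)], axis=1)
    return counts.argmax(axis=1)

def noise_sensitivity_curve(X, y, k, fractions, n_classes, n_repeats=20, seed=0):
    """For each scramble fraction, train on scrambled labels and score against the originals.

    All of the training data is used at every noise level; nothing is held out.
    """
    rng = np.random.default_rng(seed)
    curve = []
    for f in fractions:
        accs = []
        for _ in range(n_repeats):
            y_noisy = scramble_labels(y, f, n_classes, rng)
            pred = knn_predict(X, y_noisy, X, k, n_classes)
            accs.append((pred == y).mean())
        curve.append(float(np.mean(accs)))
    return np.array(curve)
```

Under these assumptions, comparing the curves for several values of `k` shows how strongly each classifier follows scrambled labels: a very flexible classifier (small `k`) tends to reproduce the noise, while a very rigid one barely reacts to it. How such curves are turned into a model-selection criterion is left to the full paper.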

Similar resources

Ranking of units by anti-ideal DMU with common weights

Data envelopment analysis (DEA) is a powerful technique for performance evaluation of decision making units (DMUs). One of the main objectives that is followed in performance evaluation is discriminating among efficient DMUs to provide a complete ranking of DMUs. DEA successfully divides them into two categories: efficient DMUs and inefficient DMUs. The DMUs in the efficient category have ident...

Artificial neural network for assessing the risk of movement disorders in infants

 Background: Prediction of developmental disorders in infancy is very important. This study aimed to predict movement disorders of children using Artificial Neural Network (ANN) model. Methods: This was a retrospective study, in which 600 infants with normal and 120 infants with abnormal neurologic examination were evaluated. For analysis, the data divided the study group randomly int...

Enhanced Predictions of Tides and Surges through Data Assimilation (TECHNICAL NOTE)

The regional waters in Singapore Strait are characterized by complex hydrodynamic phenomena as a result of the combined effect of three large water bodies viz. the South China Sea, the Andaman Sea, and the Java Sea. This leads to anomalies in water levels and generates residual currents. Numerical hydrodynamic models are generally used for predicting water levels in the ocean and seas. But thei...

Use of artificial neural networks to estimate installation damage of nonwoven geotextiles

This paper presents a feed-forward back-propagation neural network model to predict the retained tensile strength, and a design chart for estimating the strength reduction factors of nonwoven geotextiles due to the installation process. A database of 34 full-scale field tests was used to train, validate and test the developed neural network and regression model. The results show that t...

Comparison of artificial neural network and parametric regression models in predicting the survival of patients with gastric cancer

Background & Objective: Using parametric models is a common approach in survival analysis. In recent years, artificial neural network (ANN) models have increasingly been used in survival prediction. The aim of this study was to predict the survival rate of patients with gastric cancer using parametric regression and ANN models and to compare these methods. Methods: We used the data of 436 gast...

Publication date: 1993